Abstract

We consider testing for presence of a signal in Gaussian white noise with intensity $n^{-1/2}$, when the alternatives are given by smoothness ellipsoids with an $L_2$-ball of (squared) radius $\rho$ removed. It is known that, for a fixed Sobolev type ellipsoid $\Sigma(\beta,M)$ of smoothness $\beta$ and size $M$, a squared radius $\rho \asymp n^{-4\beta/(4\beta+1)}$ is the critical separation rate, in the sense that the minimax error of second kind over $\alpha$-tests stays asymptotically strictly between 0 and 1 (Ingster [22]). In addition, Ermakov [9] found the sharp asymptotics of the minimax error of second kind at the separation rate. For adaptation over both $\beta$ and $M$ in that context, it is known that a log log-penalty over the separation rate for $\rho$ is necessary for a nonzero asymptotic power. Here, following an example in nonparametric estimation related to the Pinsker constant, we investigate the adaptation problem over the ellipsoid size $M$ only, for fixed smoothness degree $\beta$. It is established that the sharp risk asymptotics can be replicated in that adaptive setting, if $\rho \to 0$ more slowly than the separation rate. The penalty for adaptation here turns out to be a sequence tending to infinity arbitrarily slowly.



1 Introduction and main result 

Consider the Gaussian white noise model in sequence space, where observations are

$$Y_j = f_j + n^{-1/2}\xi_j, \quad j = 1, 2, \ldots, \tag{1}$$

with unknown, nonrandom signal $f = (f_j)_{j=1}^\infty$, and noise variables $\xi_j$ which are i.i.d. $N(0,1)$. We intend to test the null hypothesis of "no signal" against nonparametric alternatives described as follows. For some $\beta > 0$ and $M > 0$, let $\Sigma(\beta,M)$ be the set of sequences

$$\Sigma(\beta,M) = \Big\{ f : \sum_{j=1}^\infty j^{2\beta} f_j^2 \le M \Big\};$$

this might be called a Sobolev type ellipsoid with smoothness parameter $\beta$ and size parameter $M$. Consider further the complement of an open ball in the sequence space $l_2$: if $\|f\|_2^2 = \sum_{j=1}^\infty f_j^2$ is the squared norm then

$$B_\rho = \{ f \in l_2 : \|f\|_2^2 \ge \rho \}.$$

Here $\rho^{1/2}$ is the radius of the open ball; by an abuse of language we call $\rho$ itself the "radius". We study the hypothesis testing problem

$$H_0 : f = 0 \quad \text{against} \quad H_a : f \in \Sigma(\beta,M) \cap B_\rho.$$

Assuming that $n \to \infty$, implying that the noise size $n^{-1/2}$ tends to zero, we expect that for a fixed radius $\rho$, consistent $\alpha$-testing in that setting is possible. More precisely, there exist $\alpha$-tests with type II error tending to zero uniformly over the nonparametric alternative $f \in \Sigma(\beta,M) \cap B_\rho$. If now the radius $\rho = \rho_n$ tends to zero as $n \to \infty$, the problem becomes more difficult and if $\rho_n \to 0$ too quickly, all $\alpha$-tests will have the trivial asymptotic (worst case) power $\alpha$. According to a fundamental result of Ingster [22] there is a critical rate for $\rho_n$, the so-called separation rate

$$\rho_n \asymp n^{-4\beta/(4\beta+1)} \tag{2}$$

at which the transition in the power behaviour occurs. More precisely, consider a (possibly randomized) $\alpha$-test $\phi_n$ in the model (1) for null hypothesis $H_0 : f = 0$, that is, a test fulfilling $E_{n,0}\phi_n \le \alpha$ where $E_{n,f}(\cdot)$ denotes expectation in the model (1). For given $\phi_n$, we define the worst case type II error over the alternative $f \in \Sigma(\beta,M) \cap B_\rho$ as

$$\Psi(\phi_n,\rho,\beta,M) := \sup_{f \in \Sigma(\beta,M) \cap B_\rho} (1 - E_{n,f}\phi_n). \tag{3}$$

The search for a best $\alpha$-test in this sense leads to the minimax type II error

$$\pi_n(\alpha,\rho,\beta,M) := \inf_{\phi_n :\, E_{n,0}\phi_n \le \alpha} \Psi(\phi_n,\rho,\beta,M). \tag{4}$$

An $\alpha$-test which attains the infimum above for a given $n$ is minimax with respect to type II error. Ingster's separation rate result can now be formulated as follows: if $\rho_n \asymp n^{-4\beta/(4\beta+1)}$ and $0 < \alpha < 1$ then

$$0 < \liminf_n \pi_n(\alpha,\rho_n,\beta,M) \quad \text{and} \quad \limsup_n \pi_n(\alpha,\rho_n,\beta,M) < 1 - \alpha.$$

Moreover, if $\rho_n \gg n^{-4\beta/(4\beta+1)}$ then $\pi_n(\alpha,\rho_n,\beta,M) \to 0$, and if $\rho_n \ll n^{-4\beta/(4\beta+1)}$ then $\pi_n(\alpha,\rho_n,\beta,M) \to 1 - \alpha$.

These minimax rates in nonparametric testing, presented here in the simplest case of an $l_2$-setting, have been extended in two ways. In the first of these, Ermakov [9] found the exact asymptotics of the minimax type II error $\pi_n(\alpha,\rho,\beta,M)$ (equivalently, of the maximin power) at the separation rate. The shape of that result and its derivation from an underlying Bayes-minimax theorem on ellipsoids exhibit an analogy to the Pinsker constant in nonparametric estimation. In another direction, Spokoiny [35] considered the adaptive version of the minimax nonparametric testing problem, where both $\beta$ and $M$ are unknown, and showed that the rate at which $\rho_n \to 0$ has to be slowed down by a $\log\log n$-factor if nontrivial asymptotic power is to be achieved. Thus an "adaptive minimax rate" was specified, analogous to Ingster's nonadaptive separation rate (2), where the additional $\log\log n$-factor is interpreted as a penalty for adaptation. However, this result did not involve a sharp asymptotics of type II error in the sense of [9].

It is noteworthy that in nonparametric estimation over $f \in \Sigma(\beta,M)$ with $l_2$-loss (as opposed to testing), where the risk asymptotics is given by the Pinsker constant, there is a multitude of results showing that adaptation is possible with neither a penalty in the rate nor in the constant, cf. Efromovich and Pinsker [8], Golubev [17], [18], Tsybakov [36]. The present paper deals with the question of whether the sharp risk asymptotics for testing in the sense of [9] can be reproduced in an adaptive setting, in the context of a possible rate penalty for adaptation.

Let us present the well known result on sharp risk asymptotics for testing in the nonadaptive setting. Let $\Phi$ be the distribution function of the standard normal, and for $\alpha \in (0,1)$ let $z_\alpha$ be the upper $\alpha$-quantile, such that $\Phi(z_\alpha) = 1 - \alpha$. Write $a_n \gg b_n$ (or $b_n \ll a_n$) iff $b_n = o(a_n)$, and $a_n \sim b_n$ iff $\lim_n a_n/b_n = 1$.



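Proposition 1 (Ermakov [9]) Suppose $\alpha \in (0,1)$ and that the radius $\rho_n$ tends to zero at the separation rate, more precisely

$$\rho_n \sim c \cdot n^{-4\beta/(4\beta+1)}$$

for some constant $c > 0$.

(i) For any sequence of tests $\phi_n$ satisfying $E_{n,0}\phi_n \le \alpha + o(1)$ we have

$$\Psi(\phi_n,\rho_n,\beta,M) \ge \Phi\big(z_\alpha - \sqrt{A(c,\beta,M)/2}\big) + o(1) \quad \text{as } n \to \infty,$$

where

$$A(c,\beta,M) = A_0(\beta)\, M^{-1/(2\beta)}\, c^{2+1/(2\beta)} \tag{5}$$

and $A_0(\beta)$ is Ermakov's constant

$$A_0(\beta) = \frac{2(2\beta+1)}{(4\beta+1)^{1+1/(2\beta)}}. \tag{6}$$

(ii) For every $M > 0$ there exists a sequence of tests $\phi_n$ satisfying $E_{n,0}\phi_n \le \alpha + o(1)$ such that

$$\Psi(\phi_n,\rho_n,\beta,M) \le \Phi\big(z_\alpha - \sqrt{A(c,\beta,M)/2}\big) + o(1).$$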
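The limiting error in Proposition 1 is easy to evaluate numerically. The following is a minimal Python sketch, not part of the original argument; the function names are hypothetical, and the closed form used for Ermakov's constant, $A_0(\beta) = 2(2\beta+1)/(4\beta+1)^{1+1/(2\beta)}$, should be treated as an assumption of the sketch.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: Phi = cdf, quantiles = inv_cdf


def ermakov_constant(beta: float) -> float:
    """A_0(beta); the closed form used here is an assumption of this sketch."""
    return 2.0 * (2.0 * beta + 1.0) / (4.0 * beta + 1.0) ** (1.0 + 1.0 / (2.0 * beta))


def A(c: float, beta: float, M: float) -> float:
    """A(c, beta, M) = A_0(beta) * M^(-1/(2 beta)) * c^(2 + 1/(2 beta))."""
    exponent = 2.0 + 1.0 / (2.0 * beta)
    return ermakov_constant(beta) * M ** (-1.0 / (2.0 * beta)) * c ** exponent


def sharp_type2_error(alpha: float, c: float, beta: float, M: float) -> float:
    """Limiting minimax type II error Phi(z_alpha - sqrt(A(c, beta, M) / 2))."""
    z_alpha = STD_NORMAL.inv_cdf(1.0 - alpha)  # upper alpha-quantile
    return STD_NORMAL.cdf(z_alpha - sqrt(A(c, beta, M) / 2.0))


if __name__ == "__main__":
    # The limiting error decreases from 1 - alpha towards 0 as c grows.
    for c in (0.5, 1.0, 2.0, 4.0):
        print(c, round(sharp_type2_error(0.05, c, beta=1.0, M=1.0), 4))
```

The limit is decreasing in $c$ and increasing in $M$: a larger ellipsoid makes the alternative harder to detect at the same radius.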



This gives the sharp asymptotics for the minimax type II error at the separation rate, analogous to the Pinsker constant [33] for nonparametric estimation. The optimal test attaining the bound of (ii) above, as given in [9], depends on $\beta$ and $M$. Concerning adaptivity in both of these parameters, the following result is known.
Proposition 2 (Spokoiny [35]). Let $\mathcal{T}$ be a subset of $(0,\infty) \times (0,\infty)$ such that there exist $M > 0$, $\beta_2 > \beta_1 > 0$ and

$$\mathcal{T} \supseteq \{(\beta,M) : \beta_1 \le \beta \le \beta_2\}.$$

(i) If $t_n \ll (\log\log n)^{1/2}$ and $\rho_n \sim c \cdot (n/t_n)^{-4\beta/(4\beta+1)}$, then for any $c > 0$ and any sequence of tests $\phi_n$ satisfying $E_{n,0}\phi_n \le \alpha + o(1)$, and not depending on $\beta$ or $M$, we have

$$\sup_{(\beta,M) \in \mathcal{T}} \Psi(\phi_n,\rho_n,\beta,M) \ge 1 - \alpha + o(1).$$

(ii) For any $\beta^* > 1/2$ and $0 < M_1 < M_2$, let

$$\mathcal{T} = \{(\beta,M) : 1/2 \le \beta \le \beta^*,\ M_1 \le M \le M_2\}.$$

Then there exist a constant $c_1 = c_1(\beta^*, M_1, M_2)$ and a sequence of tests $\phi_n$ satisfying $E_{n,0}\phi_n = o(1)$ such that, if

$$\rho_n \ge c_1 \left( \frac{n}{(\log\log n)^{1/2}} \right)^{-4\beta/(4\beta+1)} \tag{7}$$

then

$$\sup_{(\beta,M) \in \mathcal{T}} \Psi(\phi_n,\rho_n,\beta,M) = o(1). \tag{8}$$




Here the criterion to evaluate a test sequence has changed, to include the worst case type II error over a whole range of $\beta$, $M$. Hence the critical radius rate (7) has to be interpreted as an adaptive separation rate. It differs by a factor $(\log\log n)^{2\beta/(4\beta+1)}$ from the nonadaptive separation rate (2); this factor is an example of the well-known phenomenon of a penalty for adaptation. Furthermore, as noted in [35], a degenerate behaviour occurs here, in that both error probabilities at the critical rate tend to zero. Thus any sequence $\phi_n$ of tests fulfilling (8) should be seen as adaptive rate optimal, comparable to rate optimal tests in the nonadaptive case (that is, tests fulfilling $\limsup_n \Psi(\phi_n,\rho_n,\beta,M) < 1 - \alpha$ at $\rho_n$ given by (2)). In Ingster and Suslina [23], chap. 7, the worst case adaptive error (8) is further analyzed, with a view to a sharp asymptotics; cf. Remark 2 below for a discussion in relation to our results.

In this paper we address the question of whether an exact type II error asymptotics in the sense of [9] is possible in an adaptive setting. In our approach $\beta$ is kept fixed, while we aim for adaptation over the ellipsoid size $M$. First, we present a negative result for adaptation at the classical separation rate (2).

Theorem 1 Suppose $c > 0$, $0 < M_1 < M_2 < \infty$ and $\rho_n \sim c \cdot n^{-4\beta/(4\beta+1)}$. Then there is no test $\phi_n$ satisfying $E_{n,0}\phi_n \le \alpha + o(1)$, not depending on $M_i$, $i = 1,2$, but satisfying both relations

$$\Psi(\phi_n,\rho_n,\beta,M_i) \le \Phi\big(z_\alpha - \sqrt{A(c,\beta,M_i)/2}\big) + o(1), \quad i = 1,2.$$



This result states that adaptation even just over $M$ is impossible at the separation rate. Instead, we enlarge the radius slightly and examine how the minimax error approaches zero. To be specific, we replace the constant $c$ in $\rho_n \sim c \cdot n^{-4\beta/(4\beta+1)}$ by a sequence $c_n$ tending to infinity slowly. In that case the minimax type II error bound of Proposition 1, namely $\Phi\big(z_\alpha - \sqrt{A(c,\beta,M)/2}\big)$, will tend to zero (since $A(c,\beta,M)$ as defined in (5) contains a factor $c^{2+1/(2\beta)}$). When the log-asymptotics of this error probability is considered, as in moderate and large deviation theory, it turns out that adaptation to Ermakov's constant is possible.



Theorem 2 Assume $c_n \to \infty$ and $c_n = o(n^K)$ for every $K > 0$. If $\rho_n = c_n \cdot n^{-4\beta/(4\beta+1)}$ then there exists a test $\phi_n$ not depending on $M$ such that

$$E_{n,0}\phi_n \le \alpha + o(1),$$

and for all $M > 0$

$$\limsup_n \frac{1}{c_n^{2+1/(2\beta)}} \log \Psi(\phi_n,\rho_n,\beta,M) \le -\frac{A_0(\beta)\, M^{-1/(2\beta)}}{4}.$$
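To see where the constant $A_0(\beta) M^{-1/(2\beta)}/4$ comes from, one can plug a growing $c$ into the nonadaptive bound $\Phi(z_\alpha - \sqrt{A(c,\beta,M)/2})$ and normalize its logarithm by $c^{2+1/(2\beta)}$. The Python sketch below does this numerically; it is an illustration only. The helper names are hypothetical, the closed form for $A_0(\beta)$ is an assumption, and the far lower tail of $\Phi$ is approximated by the standard Mills-ratio asymptotics $\log \Phi(t) \approx -t^2/2 - \log(-t) - \tfrac12 \log(2\pi)$.

```python
from math import erfc, log, pi, sqrt
from statistics import NormalDist


def ermakov_constant(beta: float) -> float:
    """A_0(beta); the closed form used here is an assumption of this sketch."""
    return 2.0 * (2.0 * beta + 1.0) / (4.0 * beta + 1.0) ** (1.0 + 1.0 / (2.0 * beta))


def log_std_normal_cdf(t: float) -> float:
    """log Phi(t); uses Mills-ratio asymptotics in the far lower tail."""
    if t > -30.0:
        return log(0.5 * erfc(-t / sqrt(2.0)))
    return -0.5 * t * t - log(-t) - 0.5 * log(2.0 * pi)


def normalized_log_error(c: float, beta: float, M: float, alpha: float) -> float:
    """log Phi(z_alpha - sqrt(A(c, beta, M) / 2)) divided by c^(2 + 1/(2 beta))."""
    z_alpha = NormalDist().inv_cdf(1.0 - alpha)
    exponent = 2.0 + 1.0 / (2.0 * beta)
    a_val = ermakov_constant(beta) * M ** (-1.0 / (2.0 * beta)) * c ** exponent
    return log_std_normal_cdf(z_alpha - sqrt(a_val / 2.0)) / c ** exponent


def log_rate(beta: float, M: float) -> float:
    """Limiting value -A_0(beta) * M^(-1/(2 beta)) / 4 of the normalized log error."""
    return -ermakov_constant(beta) * M ** (-1.0 / (2.0 * beta)) / 4.0


if __name__ == "__main__":
    for c in (10.0, 100.0, 1000.0):
        print(c, normalized_log_error(c, 1.0, 1.0, 0.05) / log_rate(1.0, 1.0))
```

The printed ratios approach 1 from below, but slowly: the relative gap is of order $z_\alpha / \sqrt{A/2}$, which is one reason the log-asymptotics is a much coarser statement than the sharp limit of Proposition 1.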



However, since the optimality criterion has now been changed, a formal argument is needed that no $\alpha$-test can be better in the sense of the log-asymptotics for the error of second kind. Such a result is implied by Theorem 3 in Ermakov [11], where the nonadaptive sharp asymptotics is studied in a setting where $\rho_n = c_n \cdot n^{-4\beta/(4\beta+1)}$ with $c_n \to \infty$, hence the type II error probability tends to zero.

Proposition 3 Under the assumptions of the previous theorem, any test $\phi_n$ (possibly depending on $M$) satisfying $E_{n,0}\phi_n \le \alpha + o(1)$ also fulfills

$$\liminf_n \frac{1}{c_n^{2+1/(2\beta)}} \log \Psi(\phi_n,\rho_n,\beta,M) \ge -\frac{A_0(\beta)\, M^{-1/(2\beta)}}{4}. \tag{9}$$

This result is implied by Theorem 3 in [11], and hence the proof is omitted.
To further discuss the context of the main results, we note the following points.
To further discuss the context of the main results, we note the following points. 

Remark 1 Logarithmic vs. strong asymptotics. In [11] it is also shown that, for nonadaptive testing where $\rho_n = c_n \cdot n^{-4\beta/(4\beta+1)}$, $c_n \to \infty$, the lower bound (9) is attainable, so that the minimax type II error defined by (4) satisfies

$$\log \pi_n(\alpha,\rho_n,\beta,M) \sim -\frac{1}{4} A(c_n,\beta,M). \tag{10}$$

This holds as long as $\rho_n \ll n^{-2\beta/(2\beta+1)}$. Moreover, if additionally $\rho_n \ll n^{-3\beta/(3\beta+1)}$ then the log-asymptotics (10) can be strengthened to

$$\pi_n(\alpha,\rho_n,\beta,M) \sim \Phi\big(z_\alpha - \sqrt{A(c_n,\beta,M)/2}\big). \tag{11}$$

Results (10) and (11) have been obtained within a framework of efficient inference for moderate deviation probabilities, cf. Ermakov [10], [12]. Recall that in our setting $c_n = o(n^K)$ for every $K > 0$, so that the strong asymptotics (11) holds in the nonadaptive setting. It is an open question whether an adaptive analog of (11) holds.

For standardized sums T n of independent random variables, if {T n > x n } is a large or moder- 
ate deviation event, theorems on the relative error caused by replacing the exact distribution 
of T n by its limiting distribution are sometimes called strong large or moderate deviation 
theorems to distinguish them from first order results on logP(T n > x n ). For a background 
cf. [32], [21], 0], chap. 11. 

Remark 2 Sharp asymptotics with both $\beta$, $M$ unknown. The adaptivity result of Spokoiny [35], discussed in Proposition 2, about the rate penalty for adaptation $(\log\log n)^{2\beta/(4\beta+1)}$, does not provide a sharp risk asymptotics in the sense of either Proposition 1 or our Theorems 1 and 2. Some results in this direction are presented in section 7.1.3 of Ingster and Suslina [23]. To clarify the relation to our setting where $\beta$ is fixed and adaptivity refers to the size parameter $M$, let us discuss these results here.

Let us first reformulate the result of Proposition 1 (that is [9]) for known $\beta$, $M$ in a certain dual way, where a given type II error is prescribed and it is shown to be attainable on a radius sequence $\rho_n$ which then varies with $\beta$, $M$. Suppose $\alpha \in (0,1)$ and $d > 0$ are given, and suppose the radius $\rho_n$ satisfies

$$n\, \rho_n^{(4\beta+1)/(4\beta)} \sim A_1(\beta)\, M^{1/(4\beta)}\, d,$$

where $A_1(\beta) = (A_0(\beta)/2)^{-1/2}$, and $A_0(\beta)$ is given by (6). Then for any sequence of tests $\phi_n$ satisfying $E_{n,0}\phi_n \le \alpha + o(1)$ we have

$$\Psi(\phi_n,\rho_n,\beta,M) \ge \Phi(z_\alpha - d) + o(1) \quad \text{as } n \to \infty,$$

and there is a sequence $\phi_n$ (depending on $\beta$, $M$) attaining this lower bound. This follows directly from Proposition 1 by setting $d = \sqrt{A(c,\beta,M)/2}$ and solving for $c$.
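The step "setting $d = \sqrt{A(c,\beta,M)/2}$ and solving for $c$" can be spelled out in a few lines of Python. This is a sketch under stated assumptions: the function names are hypothetical, and the closed form for $A_0(\beta)$ is assumed rather than taken from the text. With $\rho_n = c\, n^{-4\beta/(4\beta+1)}$ one has $n\rho_n^{(4\beta+1)/(4\beta)} = c^{(4\beta+1)/(4\beta)}$, so the dual normalization can be checked without reference to $n$.

```python
from math import sqrt


def ermakov_constant(beta: float) -> float:
    """A_0(beta); the closed form used here is an assumption of this sketch."""
    return 2.0 * (2.0 * beta + 1.0) / (4.0 * beta + 1.0) ** (1.0 + 1.0 / (2.0 * beta))


def shift_from_constant(c: float, beta: float, M: float) -> float:
    """d = sqrt(A(c, beta, M) / 2), the shift inside Phi(z_alpha - d)."""
    exponent = 2.0 + 1.0 / (2.0 * beta)
    return sqrt(ermakov_constant(beta) * M ** (-1.0 / (2.0 * beta)) * c ** exponent / 2.0)


def constant_from_shift(d: float, beta: float, M: float) -> float:
    """Invert d = sqrt(A(c, beta, M) / 2) for the radius constant c."""
    exponent = 2.0 + 1.0 / (2.0 * beta)
    return (2.0 * d * d * M ** (1.0 / (2.0 * beta)) / ermakov_constant(beta)) ** (1.0 / exponent)


if __name__ == "__main__":
    beta, M, d = 0.75, 2.0, 1.5
    c = constant_from_shift(d, beta, M)
    a1 = (ermakov_constant(beta) / 2.0) ** -0.5  # A_1(beta)
    # Both sides of n * rho^((4b+1)/(4b)) = A_1(beta) * M^(1/(4b)) * d:
    print(c ** ((4.0 * beta + 1.0) / (4.0 * beta)), a1 * M ** (1.0 / (4.0 * beta)) * d)
```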
In the setting of [23], the smoothness parameter $\beta$ varies over a range $(\beta_1,\beta_2)$, as in Proposition 2. To state the lower asymptotic risk bound, assume that $0 < \beta_1 < \beta_2$, that $M > 0$ is fixed and define

$$\mathcal{T} = \{(\beta,M) : \beta_1 \le \beta \le \beta_2\}.$$

Let $D \in \mathbb{R}$ be arbitrary and define a radius sequence $\rho_{n,\beta,M}$ by

$$n\, (\rho_{n,\beta,M})^{(4\beta+1)/(4\beta)} = A_1(\beta)\, M^{1/(4\beta)} \big( (2\log\log n)^{1/2} + D \big). \tag{12}$$

The lower asymptotic risk bound (a variation of Theorem 7.1 in [23]) can then be formulated as follows. For any sequence of tests $\phi_n$ satisfying $E_{n,0}\phi_n \le \alpha + o(1)$ we have

$$\sup_{(\beta,M) \in \mathcal{T}} \Psi(\phi_n,\rho_{n,\beta,M},\beta,M) \ge (1 - \alpha)\, \Phi(-D) + o(1). \tag{13}$$

Note that in this setting, the test sequences $\phi_n$ are assumed not to depend on $\beta$ but the radius $\rho_{n,\beta,M}$ does. Note also that part (i) of Proposition 2 is implied by (13) by letting $D \to -\infty$.
As to the attainability of this bound, the test provided in section 7.3 of [23] depends on $M$. Indeed in [23] observations are assumed to be $X_j = v_j + \xi_j$, where $\xi_j$ are i.i.d. standard normal and $v = (v_j)_{j=1}^\infty$ satisfies restrictions $\sum_j v_j^2 \ge r^2$, $\sum_j j^{2\beta} v_j^2 \le R^2$ where $R \to \infty$ and $r/R \to 0$ (the "power norm" case in the book, where $p = q = 2$, $s = \beta$; also $r$ is $\rho$ in [23]). This observation model is equivalent to ours upon setting $R^2 = nM$, $r^2 = n\rho$, and then $Y_j = n^{-1/2} X_j$, $f_j = n^{-1/2} v_j$. The reasoning provided in section 7.3.2 of [23] makes it clear that the test constructed uses solutions of an extremal problem under restrictions $\{v : \sum_j v_j^2 \ge r^2,\ \sum_j j^{2\beta} v_j^2 \le R^2\}$ where $r^2 = n\rho_{n,\beta,M}$ with $\rho_{n,\beta,M}$ from (12) and $\beta$ is from a certain grid of values in $(\beta_1,\beta_2)$. Since in particular $R = n^{1/2} M^{1/2}$, it turns out that the estimator depends on $M$, though it has been made independent of $\beta \in (\beta_1,\beta_2)$. A version of such results for $\alpha_n$-tests with $\alpha_n \to 0$ is given in [24].

It should be noted that adaptation to $\beta$ only, with $M$ remaining fixed, does not have a practical interpretation in the context of smooth functions. Thus the problem of a sharp risk bound for adaptation to $(\beta,M)$ remains open in nonparametric testing; for the analogous problem in the estimation case (regarding the Pinsker bound), solutions have been presented by Golubev [18] and Tsybakov [36], sec. 3.7.

Remark 3 The detection problem. Instead of focussing on the worst case type II error $\Psi(\phi_n,\rho,\beta,M)$ (3) of $\alpha$-tests $\phi_n$, one may consider minimization of the sum of errors, that is of $E_{n,0}\phi_n + \Psi(\phi_n,\rho,\beta,M)$, over all tests $\phi_n$. That has been called the detection problem in the literature; in [23] this problem is largely treated in parallel to the one for $\alpha$-tests. There and in [25] one finds the analog of the nonadaptive sharp asymptotics of Proposition 1. It may be conjectured that analogs of our Theorems 1 and 2 concerning adaptivity hold there as well.

Remark 4 The plug-in method. In the present setting, where the degree of smoothness $\beta$ is fixed but the ellipsoid size $M$ is unknown, a natural approach to adaptivity is to try to estimate $M$ and use a plug-in method. However, uniformly consistent estimators of $M$ do not exist (since the unit ball in $L_2$ is not compact), hence for minimax optimality, such a straightforward argument fails. In the estimation setting, the solution found by Golubev [17] is to apply, for a biased estimator of $M$, the same saddle point reasoning which lies at the heart of the Pinsker [33] result about minimax optimal estimation. The paper [17] concerns the continuous white noise model indexed by $t \in [0,1]$, and the adaptivity there incorporates two local aspects: one with respect to time $t \in [0,1]$ and the other with respect to a local variant of Sobolev smoothness classes. For more discussion cf. [20].

Our result here is the analog of the one by Golubev [17] for estimation, but in testing it turns out that adaptivity is possible only in conjunction with a tail probability (moderate deviation) approach. To further clarify the connection to adaptive estimation, in Section 5.1 we present a short outline of the result of [17] in a simplified setting.

Remark 5 Quadratic functionals. In the literature it has been noted that the nonparametric testing problem with an $l_2$-ball removed is related to the estimation problem of the quadratic functional $Q(f) = \|f\|_2^2$. In particular, it is known that the optimal separation rate for testing $\rho^{1/2} \asymp n^{-2\beta/(4\beta+1)}$ (comp. (2)) and the minimax optimal rate for estimating $Q(f)$ over $\Sigma(\beta,M)$ coincide if $0 < \beta \le 1/4$, but if $\beta > 1/4$ then the latter rate becomes $n^{-1/2}$ (the so-called elbow effect; cf. Klemelä [26] and references therein). Butucea [2] gave a unified argument for lower bounds in the estimation and testing cases when rates coincide. As far as adaptive estimation rates for $Q(f)$ are concerned, the logarithmic penalty factor in the "irregular" case $0 < \beta < 1/4$ has been established in [7]. In [6] it has been shown that at the point $\beta = 1/4$ the optimal adaptive rate is $n^{-1/2} c_n$ where $c_n \to \infty$ slower than any power function of $n$, and for $\beta > 1/4$, there is no adaptation penalty on the optimal rate $n^{-1/2}$. In the case $0 < \beta < 1/4$, the only sharp adaptive minimaxity result for estimation of $Q(f)$ we are aware of is in [26]; it concerns a case where the $l_2$-Sobolev class $\Sigma(\beta,M)$ is replaced by an $l_p$-smoothness body with $p = 4$.

Remark 6 The sup-norm problem. Lepski and Tsybakov [29] proved a sharp minimax result in testing when the alternative is a Hölder class (denoted $H(\beta,L)$, say) with a sup-norm ball removed, which is a testing analog of the minimax estimation result of Korostelev [27] and also a sup-norm analog of Ermakov [9]. For adaptive minimax estimation with unknown $(\beta,L)$ in the sup-norm case cf. [19]; for the testing case where $\beta$ is given, Dümbgen and Spokoiny [5] established a sharp adaptivity result with respect to the size parameter $L$ only. The result in Theorem 2.2 of [5] can be seen as an analog of the one given here, although the methodology in the sup-norm case is much different due to the connection to deterministic optimal recovery, cf. [29]. The case of unknown $(\beta,L)$ seems to be an open problem in the sup-norm testing case, with regard to sharp minimaxity, although in [5] a test is given which is adaptive rate optimal without a $\log\log n$-type penalty. Rohde [34] discusses the sup-norm case for regression with nongaussian errors, combining methods of [5] with ideas related to rank tests.

Remark 7 Density, regression and other models. The phenomenon of the $\log\log n$-type penalty in the rate for adaptation when an $L_2$-ball is removed, as found by [35], has also been established in a discrete regression model [15], and in density models with direct and indirect observations [13], [3]. For a review of adaptive separation rates and further results in a Poisson process model cf. [13].

The structure of the paper is as follows. In Section 2 we discuss the background, for the nonadaptive setting, of the sharp asymptotic minimaxity result for testing of Ermakov [9] and its analogy to the Pinsker [33] constant. In Section 3 we present the proof of Theorem 1 about the lower bound (the necessary penalty) for adaptation and in Section 4 Theorem 2 concerning attainability is proved. In an appendix (Section 5.1), we present some more background for the reader, by giving a brief sketch of the estimation analog of our nonparametric testing result (Golubev [17]). Finally, Section 5.2 contains some proofs for the background Section 2.



2 The Bayes-minimax problem for nonparametric testing 

The purpose of this expository section is to elucidate the analogy between the Pinsker constant [33] for $l_2$-estimation over ellipsoids and the constant found by Ermakov [9] for nonparametric testing over ellipsoids with an $l_2$-ball removed. We draw on the background explanation given in [23], sec. 4.1, but we focus specifically on the fact that very similar Bayes-minimax problems are at the root of the estimation and testing variants. For the theory underlying the Pinsker constant cf. [1], [31], [36].
For this exposition, we shall assume that observations (1) are for $j = 1, \ldots, n$; we will thus assume $f \in \mathbb{R}^n$ and understand the sets $\Sigma(\beta,M)$ and $B_\rho$ accordingly, i.e. they refer only to the first $n$ coefficients of $f$. By $\|\cdot\|$ and $\langle \cdot,\cdot \rangle$ we denote euclidean norm and inner product in $\mathbb{R}^n$. Since most expressions will depend on $n$, for this discussion we shall often suppress dependence on $n$ in the notation. Assume that the radius $\rho$ tends to zero at the critical rate, that is $\rho \asymp n^{-4\beta/(4\beta+1)}$. Let $\mathbb{R}^n_+ = [0,\infty)^n$; for a certain $d \in \mathbb{R}^n_+$, consider a quadratic statistic of the form $\tilde{T} = n \sum_{j=1}^n d_j Y_j^2$. Under $H_0$, we have $E_{0,n}\tilde{T} = \sum_{j=1}^n d_j$ and $\mathrm{Var}_{0,n}\tilde{T} = 2\|d\|^2$. Since we will work with the normalized test statistic, obtained by centering and dividing by the standard deviation, it is obvious that we need only consider coefficients $d$ fulfilling $\|d\|^2 = 1$. Accordingly define, for such coefficients $d$, the statistic

$$T = \frac{1}{\sqrt{2}} \sum_{j=1}^n d_j \big( n Y_j^2 - 1 \big). \tag{14}$$

Under $H_0$, we now have $E_0 T = 0$ and $\mathrm{Var}_0 T = 1$. We will consider quadratic tests

$$\psi_d = \mathbf{1}\{T > z_\alpha\}. \tag{15}$$

A further condition on $d$ is imposed by requiring $d \in \mathcal{D}$, a set which is defined for a given sequence $\delta = (\log n)^{-1}$ as

$$\mathcal{D} = \Big\{ d \in \mathbb{R}^n_+ : \|d\|^2 = 1 \text{ and } \sup_j d_j^2 \le \delta/(n\rho) \Big\}. \tag{16}$$

For any test, we are interested in the worst case type II error under the constraint $f \in \Sigma(\beta,M) \cap B_\rho$. A monotonicity argument shows that for every $\psi_d$, this is attained when $\|f\|^2$ is minimal, i.e. at $\|f\|^2 = \rho$. It follows that for quadratic tests $\psi_d$, we may replace the restriction $f \in B_\rho$ by $f \in B'_\rho$ where

$$B'_\rho = \{ f \in \mathbb{R}^n : \rho \le \|f\|^2 \le 2\rho \}.$$

For $f \in \mathbb{R}^n$ we set $f^2 := (f_j^2)_{j=1}^n$. For $d \in \mathcal{D}$ and $g \in \mathbb{R}^n_+$ define the functional

$$L(d,g) = \frac{n}{\sqrt{2}} \langle d, g \rangle.$$
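The normalization of the quadratic statistic can be checked by simulation. The sketch below assumes the centred and scaled form $T = 2^{-1/2}\sum_j d_j (nY_j^2 - 1)$ with $\|d\| = 1$, which is what centering $n\sum_j d_j Y_j^2$ and dividing by its null standard deviation gives, and the drift $L(d,f^2) = (n/\sqrt{2})\langle d, f^2\rangle$, which is the mean of $T$ under a signal $f$. Function names, the uniform weights and the sample sizes are illustrative choices, not taken from the text.

```python
import random
from math import sqrt
from statistics import fmean, pvariance


def quadratic_statistic(y, d, n):
    """T = (1/sqrt 2) * sum_j d_j * (n * y_j^2 - 1); assumes ||d|| = 1."""
    return sum(dj * (n * yj * yj - 1.0) for dj, yj in zip(d, y)) / sqrt(2.0)


def drift(d, f, n):
    """L(d, f^2) = (n / sqrt 2) * <d, f^2>, the mean of T under signal f."""
    return n * sum(dj * fj * fj for dj, fj in zip(d, f)) / sqrt(2.0)


def simulate_null(n, d, reps, seed=0):
    """Draw T repeatedly under H_0, where Y_j ~ N(0, 1/n) independently."""
    rng = random.Random(seed)
    sd = 1.0 / sqrt(n)
    return [
        quadratic_statistic([rng.gauss(0.0, sd) for _ in d], d, n)
        for _ in range(reps)
    ]


if __name__ == "__main__":
    m, n = 100, 1000
    d = [1.0 / sqrt(m)] * m  # uniform weights with unit euclidean norm
    draws = simulate_null(n, d, reps=2000)
    print("null mean:", fmean(draws), "null variance:", pvariance(draws))
    print("drift for f_j = 0.01:", drift(d, [0.01] * m, n))
```

With these choices the empirical null mean and variance should be close to 0 and 1, which is the content of the statement that $T$ is standardized under $H_0$.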



Lemma 1 (a) Under $H_0$, we have $T \rightsquigarrow N(0,1)$ uniformly over $d \in \mathcal{D}$.

(b) The statistic $T$ given by (14) fulfills

$$T - L(d,f^2) \rightsquigarrow N(0,1)$$

uniformly over $d \in \mathcal{D}$ and $f \in B'_\rho$.

(c) Suppose $f$ is random such that $f_j \sim N(0,\sigma_j^2)$ for a certain $\sigma \in \mathbb{R}^n$. Then the statistic $T$ given by (14) fulfills

$$T - L(d,\sigma^2) \rightsquigarrow N(0,1)$$

uniformly over $d \in \mathcal{D}$ and $\sigma \in B'_\rho$.
Denote the expectation under the model of (c) by $E^*$. The lemma implies that uniformly over $d \in \mathcal{D}$ and $f \in \{0\} \cup (\Sigma(\beta,M) \cap B'_\rho)$

$$E_f(1 - \psi_d) = \Phi\big(z_\alpha - L(d,f^2)\big) + o(1) \tag{17}$$
$$\phantom{E_f(1 - \psi_d)} = E^*_f(1 - \psi_d) + o(1). \tag{18}$$

In particular, all quadratic tests $\psi_d$ with $d \in \mathcal{D}$ are asymptotic $\alpha$-tests under $H_0 : f = 0$. To characterize the worst case error under the alternative $H_a : f \in \Sigma(\beta,M) \cap B_\rho$, we use (17) and the strict monotonicity of $\Phi$ and look for a saddlepoint of the functional $L(d,f^2)$.

Lemma 2 For $n$ large enough, there exists a saddlepoint $d_0 \in \mathcal{D}$, $f_0 \in \Sigma(\beta,M) \cap B'_\rho$ of the functional $L(d,f^2)$ such that

$$L(d,f_0^2) \le L(d_0,f_0^2) \le L(d_0,f^2)$$

for all $d \in \mathcal{D}$ and all $f \in \Sigma(\beta,M) \cap B'_\rho$.

The normal distribution on the signal $f$ postulated in (c) will be interpreted as a prior distribution. The next result shows that the Bayesian tests in this context are quadratic tests $\psi_d$; and in particular, if $\sigma^2$ is taken at the saddlepoint ($\sigma^2 = f_0^2$) then $d \in \mathcal{D}$, i.e. it fulfills the infinitesimality condition $\sup_j d_j^2 \le \delta/(n\rho)$.

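Lemma 3 (a) For any $\sigma^2 \in \mathbb{R}^n_+$, the Neyman-Pearson $\alpha$-test for simple hypotheses

$$H_0 : Y_j \sim N(0, n^{-1}), \quad j = 1, \ldots, n \quad \text{vs.}$$
$$H^*_a : Y_j \sim N(0, \sigma_j^2 + n^{-1}), \quad j = 1, \ldots, n$$

is equivalent to a quadratic test of form $\psi_d = \mathbf{1}\{T > t\}$ where $T = \sum_{j=1}^n d_j Y_j^2$, $d \in \mathbb{R}^n_+$, $\|d\| = 1$.

(b) If $\sigma^2 = f_0^2$ then the pertaining $d$ is in $\mathcal{D}$ for $n$ large enough, and $t \to z_\alpha$.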



Part (b) implies that 



We are now ready to present the essence of the argument underlying the result of Ermakov [9]. Recall that $\pi_n(\alpha,\rho,\beta,M)$ denotes the minimax type II error over all $\alpha$-tests. Denote the value of $L(d,f^2)$ at the saddlepoint

$$L_0 := L(d_0,f_0^2) = \sup_{d \in \mathcal{D}}\ \inf_{f \in \Sigma(\beta,M) \cap B'_\rho} L_n(d,f^2) = \inf_{f \in \Sigma(\beta,M) \cap B'_\rho}\ \sup_{d \in \mathcal{D}} L_n(d,f^2). \tag{20}$$
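We begin with an $\alpha' > \alpha$ such that asymptotic $\alpha$-tests are $\alpha'$-tests for $n$ large enough. Then

$$
\begin{aligned}
\pi_n(\alpha',\rho,\beta,M) &= \inf_{\phi :\, E_0\phi \le \alpha'}\ \sup_{f \in \Sigma(\beta,M) \cap B_\rho} E_f(1 - \phi) && (21) \\
&\le \inf_{d \in \mathcal{D}}\ \sup_{f \in \Sigma(\beta,M) \cap B_\rho} E_f(1 - \psi_d) \\
&= \inf_{d \in \mathcal{D}}\ \sup_{f \in \Sigma(\beta,M) \cap B'_\rho} E_f(1 - \psi_d) \\
&= \inf_{d \in \mathcal{D}}\ \sup_{f \in \Sigma(\beta,M) \cap B'_\rho} \Phi\big(z_\alpha - L_n(d,f^2)\big) + o(1) && \text{[relation (17)]} \\
&= \Phi\big(z_\alpha - L_n(d_0,f_0^2)\big) + o(1) && \text{[monotonicity of } \Phi \text{ and (20)]} \\
&= \inf_{d \in \mathcal{D}} E^*_{f_0}(1 - \psi_d) + o(1) && \text{[relation (18)]} \\
&= \inf_{\phi :\, E_0\phi \le \alpha} E^*_{f_0}(1 - \phi) + o(1) && \text{[relation (19)]}.
\end{aligned}
$$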

The main term of the last expression is the Bayes risk for a prior distribution $f_j \sim N(0, f_{0,j}^2)$ in the original model $Y_j \sim N(f_j, n^{-1})$. Since $f_0 \in \Sigma(\beta,M) \cap B_\rho$ and is extremal there, it fulfills

$$\sum_{j=1}^n j^{2\beta} f_{0,j}^2 = M, \qquad \sum_{j=1}^n f_{0,j}^2 = \rho$$

(see the precise description of the saddlepoint $(d_0,f_0)$ in Lemma 7 below). It can therefore be shown (as in the original Pinsker [33] result) that this prior distribution asymptotically concentrates on every set of the form $\Sigma(\beta,M(1+\varepsilon)) \cap B'_{\rho(1-\varepsilon)}$ for $\varepsilon > 0$. A standard reasoning by truncation shows that in this case, for a certain probability measure $G$ strictly concentrated on $\Sigma(\beta,M(1+\varepsilon)) \cap B'_{\rho(1-\varepsilon)}$,

$$\inf_{\phi :\, E_0\phi \le \alpha} E^*_{f_0}(1 - \phi) \le \inf_{\phi :\, E_0\phi \le \alpha} \int E_f(1 - \phi)\, dG(f) + o(1).$$

However, by the relation between Bayes and minimax risk

$$\inf_{\phi :\, E_0\phi \le \alpha} \int E_f(1 - \phi)\, dG(f) \le \pi_n\big(\alpha, \rho(1-\varepsilon), \beta, M(1+\varepsilon)\big). \tag{22}$$

Summarizing (21) - (22) we have obtained for every $\varepsilon > 0$

$$\pi_n(\alpha(1+\varepsilon),\rho,\beta,M) \le \Phi\big(z_\alpha - L_n(d_0,f_0^2)\big) + o(1) \le \pi_n\big(\alpha,\rho(1-\varepsilon),\beta,M(1+\varepsilon)\big) + o(1).$$

Below in Lemma 5 it is shown that if $\rho = c \cdot n^{-4\beta/(4\beta+1)}$, $c$ constant, then

$$L(d_0,f_0^2) \sim \sqrt{A_0(\beta)\, M^{-1/(2\beta)}\, c^{2+1/(2\beta)}/2}.$$

Since the right side is continuous in $M$ and $c$, the result of Proposition 1 follows.
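The saddlepoint value can also be approximated numerically. The Python sketch below is an illustration under explicit assumptions, not the construction of the paper: it takes the extremal signal to have the truncated form $f_{0,j}^2 = \lambda (1 - (j/N)^{2\beta})_+$ (the same shape as the prior variances in Section 3), fits $\lambda$ and the cutoff $N$ to the two tight constraints $\sum_j f_{0,j}^2 = \rho$ and $\sum_j j^{2\beta} f_{0,j}^2 = M$, and uses that the supremum of $\langle d, f_0^2\rangle$ over unit vectors $d$ equals $\|f_0^2\|$. The closed form for $A_0(\beta)$ and all function names are assumptions of the sketch; it also assumes $\rho < M$.

```python
from math import sqrt


def truncated_profile(cutoff, beta, rho):
    """Values lam * (1 - (j/cutoff)^(2 beta))_+ with lam fixed by sum = rho."""
    shape = [1.0 - (j / cutoff) ** (2.0 * beta) for j in range(1, int(cutoff) + 1)]
    shape = [s for s in shape if s > 0.0]
    lam = rho / sum(shape)
    return [lam * s for s in shape]


def ellipsoid_size(profile, beta):
    """sum_j j^(2 beta) * f_j^2 for a profile of squared coefficients."""
    return sum(j ** (2.0 * beta) * s for j, s in enumerate(profile, start=1))


def extremal_profile(beta, M, rho):
    """Bisect on the cutoff so that the ellipsoid constraint is tight (needs rho < M)."""
    lo, hi = 1.5, 2.0
    while ellipsoid_size(truncated_profile(hi, beta, rho), beta) < M:
        hi *= 2.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if ellipsoid_size(truncated_profile(mid, beta, rho), beta) < M:
            lo = mid
        else:
            hi = mid
    return truncated_profile(hi, beta, rho)


def saddle_value(n, beta, M, c):
    """(n / sqrt 2) * ||f_0^2|| at radius rho = c * n^(-4 beta / (4 beta + 1))."""
    rho = c * n ** (-4.0 * beta / (4.0 * beta + 1.0))
    f0_sq = extremal_profile(beta, M, rho)
    return n * sqrt(sum(s * s for s in f0_sq)) / sqrt(2.0)


def ermakov_limit(beta, M, c):
    """sqrt(A(c, beta, M) / 2), with an assumed closed form for A_0(beta)."""
    a0 = 2.0 * (2.0 * beta + 1.0) / (4.0 * beta + 1.0) ** (1.0 + 1.0 / (2.0 * beta))
    return sqrt(a0 * M ** (-1.0 / (2.0 * beta)) * c ** (2.0 + 1.0 / (2.0 * beta)) / 2.0)


if __name__ == "__main__":
    print(saddle_value(10**6, 1.0, 1.0, 1.0), ermakov_limit(1.0, 1.0, 1.0))
```

For large $n$ the two printed numbers should agree up to a discretization error of order $1/N$, where $N$ is the fitted cutoff.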
3 Proof of Theorem 1

For brevity we write $A_i = A(c,\beta,M_i)$, $i = 1,2$ in this section. Assume there exists a test $\phi_n$ not depending on $M$ such that

$$E_{0,n}\phi_n \le \alpha + o(1), \tag{23}$$
$$\sup_{f \in \Sigma(\beta,M_i) \cap B_\rho} E_{f,n}(1 - \phi_n) \le \Phi\big(z_\alpha - \sqrt{A_i/2}\big) + o(1), \tag{24}$$

for $i = 1$ and $2$. Let $G_{n,M_i}$ be the Gaussian prior for $f$ with $f_j \sim N(0,\sigma_j^2(M_i))$ independently, where

$$\sigma_j^2(M_i) = (\lambda - \mu j^{2\beta})_+, \quad j = 1, 2, \ldots$$

and where $\lambda$ and $\mu$ are determined by

$$\sum_j j^{2\beta} \sigma_j^2 = M_i \quad \text{and} \quad \sum_j \sigma_j^2 = \rho.$$

It can be shown that $G_{n,M_i}$ asymptotically concentrates on $\Sigma(\beta,M_i(1+\varepsilon)) \cap B'_{\rho(1-\varepsilon)}$ for any small $\varepsilon > 0$. Then

$$\sup_{f \in \Sigma(\beta,M_i(1+\varepsilon)) \cap B'_{\rho(1-\varepsilon)}} E_{f,n}(1 - \phi_n) \ge (1 + o(1)) \cdot \int E_{f,n}(1 - \phi_n)\, G_{n,M_i}(df).$$
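Recall $Y_j = f_j + n^{-1/2}\xi_j$. Let the joint distributions of $(Y_j)_{j=1}^\infty$ under the priors $G_{n,0}$, $G_{n,M_1}$ and $G_{n,M_2}$ be $Q_{0,n}$, $Q_{1,n}$ and $Q_{2,n}$, respectively, i.e.,

$$
\begin{aligned}
Q_{0,n} &: \quad Y_j \sim N(0, n^{-1}), && j = 1, 2, \ldots \\
Q_{1,n} &: \quad Y_j \sim N\big(0, n^{-1} + \sigma_j^2(M_1)\big), && j = 1, 2, \ldots \\
Q_{2,n} &: \quad Y_j \sim N\big(0, n^{-1} + \sigma_j^2(M_2)\big), && j = 1, 2, \ldots
\end{aligned}
$$

Therefore,

$$E_{Q_{i,n}}(1 - \phi_n) = \int E_{f,n}(1 - \phi_n)\, G_{n,M_i}(df), \quad i = 1,2.$$

Combining these with (24) and (23) gives

$$E_{Q_{0,n}}\phi_n \le \alpha + o(1),$$

$$E_{Q_{i,n}}(1 - \phi_n) \le \Phi\big(z_\alpha - \sqrt{A_i/2}\big) + \Big[ \sup_{f \in \Sigma(\beta,M_i(1+\varepsilon)) \cap B'_{\rho(1-\varepsilon)}} E_{f,n}(1 - \phi_n) - \sup_{f \in \Sigma(\beta,M_i) \cap B_\rho} E_{f,n}(1 - \phi_n) \Big] + o(1).$$

Note that $E_{f,n}(1 - \phi_n)$ is continuous in $f$. Since $\varepsilon$ can be arbitrarily small, we have

$$E_{Q_{i,n}}(1 - \phi_n) \le \Phi\big(z_\alpha - \sqrt{A_i/2}\big) + o(1), \quad i = 1,2.$$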
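The likelihood ratio of $Q_{i,n}$ against $Q_{0,n}$ is

$$\frac{dQ_{i,n}}{dQ_{0,n}} = \prod_j \big(1 + n\sigma_j^2(M_i)\big)^{-1/2} \exp\Big( \frac{n}{2} \sum_j \frac{n\sigma_j^2(M_i)}{1 + n\sigma_j^2(M_i)}\, Y_j^2 \Big).$$

Therefore, by the factorization theorem, it is seen that the bivariate vector

$$T_n = (T_{n,1}, T_{n,2}), \qquad T_{n,i} = \sum_j \frac{n\sigma_j^2(M_i)\, (nY_j^2 - 1)}{\big(1 + n\sigma_j^2(M_i)\big) \sqrt{2n^2 \sum_k \sigma_k^4(M_i)}},$$

is a sufficient statistic for the family of distributions $\{Q_{0,n}, Q_{1,n}, Q_{2,n}\}$. Write the induced family for $T_n$ as $\{Q^T_{0,n}, Q^T_{1,n}, Q^T_{2,n}\}$ and take the conditional expectation $\phi^*_n(T_n) = E_{Q_{\cdot,n}}(\phi_n \mid T_n)$. By sufficiency the (possibly randomized) test $\phi^*_n(T_n)$ for $\{Q^T_{0,n}, Q^T_{1,n}, Q^T_{2,n}\}$ is as good as $\phi_n$ (cf. for instance Theorem 4.66 in [30]), that is

$$E_{Q^T_{0,n}}\phi^*_n = E_{Q_{0,n}}\phi_n \le \alpha + o(1), \tag{25}$$

$$E_{Q^T_{i,n}}(1 - \phi^*_n) = E_{Q_{i,n}}(1 - \phi_n) \le \Phi\big(z_\alpha - \sqrt{A_i/2}\big) + o(1), \quad i = 1,2. \tag{26}$$

Then we have the following lemma, which is proved later.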

Then we have the following lemma, which is proved later. 



Lemma 4 Under $\{Q_{0,n}, Q_{1,n}, Q_{2,n}\}$, the law of the statistic $T_n$ converges in total variation to $N(0,\Sigma)$, $N(\mu_1,\Sigma)$ and $N(\mu_2,\Sigma)$ respectively, where

$$r = \left( \frac{M_1}{M_2} \right)^{1/(4\beta)} \frac{4\beta + 1 - M_1/M_2}{4\beta}. \tag{27}$$



Then by the weak compactness theorem (cf. [28], A.5.1), there exists a test $\phi^*$ and a subsequence $\phi^*_{n_k}$ such that $\phi^*_{n_k}$ converges weakly to $\phi^*$. Thus

$$E_{N(0,\Sigma)}\phi^* \le \alpha,$$

$$E_{N(\mu_i,\Sigma)}(1 - \phi^*) \le \Phi\big(z_\alpha - \sqrt{A_i/2}\big), \quad i = 1,2.$$

For $i = 1,2$ respectively, by the Neyman-Pearson lemma and some direct calculations, the right hand side of the previous inequality is the type II error of the uniformly most powerful test for $N(0,\Sigma)$ against $N(\mu_i,\Sigma)$. Therefore, $\phi^*$ is a uniformly most powerful test for $N(0,\Sigma)$ against $\{N(\mu_1,\Sigma), N(\mu_2,\Sigma)\}$.
Note that $r$ in Lemma 4 is monotone increasing with respect to $M_1/M_2$, and then $0 < r < 1$ for $M_2 > M_1 > 0$. Thus, $\mu_1$, $\mu_2$ and the origin are not on the same line. For $i = 1,2$ respectively, the log-likelihood ratio for $N(\mu_i,\Sigma)$ against $N(0,\Sigma)$ is an increasing function of $T^\top \Sigma^{-1} \mu_i$, which is a positive multiple of the coordinate $T_i$. Then by the necessity part of the Neyman-Pearson lemma ([28], Theorem 3.2.1), the uniformly most powerful test for $N(0,\Sigma)$ against $N(\mu_i,\Sigma)$ has the form $\mathbf{1}\{T_i > k_i\}$. But since these two types of tests can never coincide, there is no uniformly most powerful test for $N(0,\Sigma)$ against $\{N(\mu_1,\Sigma), N(\mu_2,\Sigma)\}$. By this contradiction, Theorem 1 is proved.
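The properties of the correlation $r$ used in this argument (monotonicity in $M_1/M_2$ and $0 < r < 1$) can be checked numerically. The Python sketch below rests on two assumptions that are not spelled out in the text: that $r$ has the closed form $a^{1/(4\beta)}(4\beta + 1 - a)/(4\beta)$ with $a = M_1/M_2$, and that asymptotically $r$ is the normalized inner product of the two weight sequences $(1 - (j/N_i)^{2\beta})_+$ with $N_1/N_2 = a^{1/(2\beta)}$. Function names and the grid size are illustrative.

```python
from math import sqrt


def correlation_closed_form(ratio: float, beta: float) -> float:
    """Assumed closed form r = a^(1/(4 beta)) * (4 beta + 1 - a) / (4 beta), a = M1/M2."""
    return ratio ** (1.0 / (4.0 * beta)) * (4.0 * beta + 1.0 - ratio) / (4.0 * beta)


def correlation_from_weights(ratio: float, beta: float, n2: int = 20000) -> float:
    """Normalized inner product of the two truncated weight sequences."""
    n1 = n2 * ratio ** (1.0 / (2.0 * beta))  # smaller cutoff for the smaller ellipsoid
    w1 = [max(0.0, 1.0 - (j / n1) ** (2.0 * beta)) for j in range(1, n2 + 1)]
    w2 = [max(0.0, 1.0 - (j / n2) ** (2.0 * beta)) for j in range(1, n2 + 1)]
    cross = sum(a * b for a, b in zip(w1, w2))
    return cross / sqrt(sum(a * a for a in w1) * sum(b * b for b in w2))


if __name__ == "__main__":
    for ratio in (0.2, 0.5, 0.9):
        print(ratio, correlation_closed_form(ratio, 1.0), correlation_from_weights(ratio, 1.0))
```

In this sketch $r$ increases to 1 as $M_1/M_2 \to 1$, consistent with the two limiting means becoming collinear only when the two ellipsoid sizes coincide.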

Proof of Lemma 4. For simplicity, we only show the result for the first coordinate of $T_n$. The proof can be extended to $T_n$ naturally. Under $Q_{0,n}$, the characteristic function

of n[Y ^' n) ~ JV(0,1) is g{t) = exp(-t 2 /2). Note g(t) = 1 - \t 2 + o(t 2 ), as t -> and 
J \g(t)\ < 00. The density of T Ui \ can be written as 

2W V( 1+no i 2 ( M i))\/s fc ^ 4 (^i)/ 

where, by Lévy's continuity theorem, the integrand converges to $e^{-itx} \exp\{-t^2/2\}$. By splitting the integral into two parts and using dominated convergence, it can be shown that the integral converges to

$$\frac{1}{2\pi} \int e^{-itx} e^{-t^2/2}\, dt = \frac{e^{-x^2/2}}{\sqrt{2\pi}}.$$

Then an application of Scheffé's theorem (cf. [37], 2.30) establishes convergence in total variation. The correlation $r$ can be calculated directly. ■



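4 Proof of Theorem 2

Choose $N$ and $\gamma_n = o(1)$ such that

$$\gamma_n^{1/(2\beta)} \cdot n^{2/(4\beta+1)} \gg N \gg c_n^{-1/(2\beta)} \cdot n^{2/(4\beta+1)}, \tag{28}$$

e.g. $\gamma_n = c_n^{-1/2}$, $N = c_n^{-1/(3\beta)} \cdot n^{2/(4\beta+1)}$. Define

$$M_0 = M_0(f) = \sum_{j=1}^N j^{2\beta} f_j^2 + \gamma_n,$$

$$d_j = d_j(M_0) = \lambda \big[ 1 - (j/N(M_0))^{2\beta} \big]_+,$$

which all depend on the unknown $f$. Define the oracle statistic

$$T^*_n = \frac{n^2 \sum_j d_j(M_0) Y_j^2 - n \sum_j d_j(M_0)}{\sqrt{2n^2 \sum_j d_j^2(M_0)}}$$

and the oracle test $\phi^*_n = \mathbf{1}\{T^*_n > z_\alpha\}$. The following lemma holds; it is proved later.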



Lemma 5 Under the assumptions of Theorem 2 the oracle test $\phi^*$ is an asymptotic $\alpha$-test
and

limsup c2+1/m log Pn , P, M) < 



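Define

$$\hat M = \sum_{j=1}^{N} j^{2\beta}\left(Y_j^2 - n^{-1}\right) + \gamma_n$$

and introduce the statistic

$$T_n = \frac{n^2\sum_j d_j(\hat M)Y_j^2 - n\sum_j d_j(\hat M)}{\sqrt{2n^2\sum_j d_j^2(\hat M)}}$$

and also the test

$$\phi_n = \mathbf{1}\{T_n > z_\alpha\}.$$

For $\hat M$, we have the following lemma, which is proved later.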

Lemma 6 Under the assumptions of Theorem 2 we have

$$\frac{\hat M}{M_0(f)} - 1 = o_P(1),$$

uniformly for $f \in \Sigma(\beta, M) \cap B_\rho$.



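Now rewrite

$$T_n = \sum_j \frac{d_j(\hat M)}{\sqrt{\sum_k d_k^2(\hat M)}}\cdot\frac{Y_j^2 - 1/n}{\sqrt{2}\,n^{-1}},$$

where $d_j(\hat M) = \lambda\left(1 - (j/N(\hat M))^{2\beta}\right)_+$. Since $\lambda$ in the last display can be canceled, for simplicity
we write $d_j(\hat M) = \left(1 - (j/N(\hat M))^{2\beta}\right)_+$ from now on in this section. First, since $N(\hat M) \ge N(\gamma_n)$, we have

$$\sum_j d_j^2(\hat M) = \sum_j\left(1 - \left(\frac{j}{N(\hat M)}\right)^{2\beta}\right)_+^2 = (1+o(1))\,N(\hat M)\int_0^\infty (1-t^{2\beta})_+^2\,dt = (1+o(1))\,N(\hat M)K(\beta).$$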
Therefore,

$$T_n = (1+o_P(1))\sum_j \frac{d_j(\hat M)}{\sqrt{N(\hat M)K(\beta)}}\cdot\frac{Y_j^2 - 1/n}{\sqrt{2}\,n^{-1}}.$$

By Lemma 6,

$$T_n = (1+o_P(1))\sum_j \frac{d_j(\hat M)}{\sqrt{N(M_0(f))K(\beta)}}\cdot\frac{Y_j^2 - 1/n}{\sqrt{2}\,n^{-1}}.$$

At this point, make $\hat M$ independent of $Y_j^2$ by sample splitting. Set $n = rn + (1-r)n$, where
$r$ is close to 1 but fixed, and $n_1 = rn$, $n_2 = (1-r)n$. Assume two sets of observations

$$Y_{1j} = f_j + n_1^{-1/2}\xi_{1j}, \quad j = 1,2,\ldots \qquad (29)$$
$$Y_{2j} = f_j + n_2^{-1/2}\xi_{2j}, \quad j = 1,2,\ldots \qquad (30)$$

Use $\{Y_{2j}\}$ to obtain $\hat M$, and now replace $T_n$ by

$$T_n^0 = (1+o(1))\sum_j \frac{d_j(\hat M)}{\sqrt{N(M_0(f))K(\beta)}}\cdot\frac{Y_{1j}^2 - n_1^{-1}}{\sqrt{2}\,n_1^{-1}}.$$



Denote the difference of coefficients by $\Delta_j = d_j(\hat M) - d_j(M_0(f))$. Note the largest difference
is obtained at $j \approx \min\{N(\hat M), N(M_0(f))\}$. Then

|M-M (/)| 

7n 

uniformly for all j. Note in T\ there are at most C^Cn ^ , n 2 /( 4 P+ 1 ) nonzero coefficients. 
Then 

/7„ r - 1 /( 2 «„2/(4/3+l) 

r--(i + o(i)) V , d JMM „ i - 



y2._ n -i 

where 77^ = ^ n _\ > and 



-1/(2/3) 2/(4,8 + 1) 

l '* A 



3=1 



y/N(M (f))K(P) 



Under $H_0$, the r.v.'s $\eta_j$ are independent of $\hat M$ and $E\eta_j = 0$, $\mathrm{Var}(\eta_j) = 1$. Thus
$\mathrm{Var}(r_n) = E r_n^2 = E\,E(r_n^2 \mid \{Y_{2j}\})$ and

r' n ^ _1 /( 2,3 '» 1 2/(4 / 9 + l) „ 

Therefore, by the result for $\mathrm{Var}(\hat M)$ in the proof of Lemma 6,

E\M - M (/)| 2 _ Var(M) 2if(/3)iV 4 ^+ 1 4N 2 ^M 

Var(r n ) < 2+1/(2/3) ~~ 2+1/(2/3) - 2 2+1/(2/3) H 2+1/(2/3) ' 

7n 7n « In nj n 
where the last two terms converge to 0 by the first inequality in (28). Hence, under $H_0$, the
r.v.'s $T_n$ and $T_n^0$ converge to $N(0, 1)$ in law.

Next, we consider $T_n$ or $T_n^0$ under the alternative. The worst case type II error is determined
by the following quantity

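$$\inf_{f \in \Sigma(\beta,M)\cap B_\rho} \frac{n\sum_j d_j(\hat M) f_j^2}{\left(2\sum_j d_j^2(\hat M)\right)^{1/2}}.$$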

First, since $N(\hat M) \ge (\gamma_n/c_n)^{1/(2\beta)}\cdot n^{2/(4\beta+1)} \to \infty$,

$$\sum_j d_j^2(\hat M) = (1+o(1))\,N(\hat M)\int_0^1 (1-t^{2\beta})^2\,dt.$$

Second, consider 



Note 



< 1 + ^'WTTWW (31> 



N N 

E/^w = E/K 1 -(^) 2/9 )- 

i=i i=i 



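$$\sum_{j=1}^{N} f_j^2 = \sum_{j=1}^{\infty} f_j^2 - \sum_{j=N+1}^{\infty} f_j^2 \ge \rho - N^{-2\beta}M = \rho\left(1 - \frac{M}{N^{2\beta}\rho}\right) = \rho(1+o(1)), \qquad (32)$$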



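where the last step refers to the second inequality of (28). On the other hand, since $N \gg N(\hat M)$
and $N(\hat M) = [(4\beta+1)\hat M\rho^{-1}]^{1/(2\beta)}$,

$$\sum_{j=1}^{N(\hat M)} f_j^2\left(\frac{j}{N(\hat M)}\right)^{2\beta} + \sum_{j=N(\hat M)+1}^{N} f_j^2 \le \sum_{j=1}^{N} f_j^2\left(\frac{j}{N(\hat M)}\right)^{2\beta} \le N(\hat M)^{-2\beta}M_0(f) = (1+o_P(1))\,\rho\,(1+4\beta)^{-1}, \qquad (33)$$

where the last equality uses Lemma 6.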



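Combining (35) - (37) gives

$$\sum_{j=1}^{N} f_j^2 d_j(\hat M) \ge (1+o(1))\,\rho\,\frac{4\beta}{4\beta+1}.$$

Combining this with (34) gives

$$\frac{n\sum_j d_j(\hat M) f_j^2}{\left(2\sum_j d_j^2(\hat M)\right)^{1/2}} \ge (1+o(1))\,\frac{n}{\sqrt{2}}\sqrt{\frac{2(2\beta+1)}{4\beta+1}}\,\frac{\rho}{\sqrt{N(\hat M)}}$$

$$\ge (1+o(1))\sqrt{\frac{(2\beta+1)\,c_n^{2+1/(2\beta)}}{(4\beta+1)^{1+1/(2\beta)}(M+\gamma_n)^{1/(2\beta)}}}$$

$$\ge (1+o(1))\sqrt{\tfrac{1}{2}A_0(\beta)\,c_n^{2+1/(2\beta)}M^{-1/(2\beta)}}.$$

Theorem 2 is proved.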

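Proof of Lemma 5. Rewrite

$$T^* = \sum_j \frac{d_j(M_0(f))}{\sqrt{\sum_k d_k^2(M_0(f))}}\cdot\frac{Y_j^2 - 1/n}{\sqrt{2}\,n^{-1}}.$$

Under $H_0$, we have $f = 0$, and $M_0(f) = \gamma_n$. Since

$$\sum_j\left[1 - (j/N(\gamma_n))^{2\beta}\right]_+^2 \sim N(\gamma_n)\int_0^1 (1-t^{2\beta})^2\,dt = K(\beta)\cdot(\gamma_n/c_n)^{1/(2\beta)}n^{2/(4\beta+1)},$$

then

$$\frac{d_j^2(M_0(f))}{\sum_k d_k^2(M_0(f))} \le \frac{1}{K(\beta)\cdot(\gamma_n/c_n)^{1/(2\beta)}n^{2/(4\beta+1)}} = o(1)$$

uniformly for all $j$. It can be shown that $T^*$ converges to $N(0, 1)$ in law.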
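By similar arguments, the worst type II error is $(1+o(1))\Phi(z_\alpha - L_n)$ where

$$L_n = \inf_{f \in \Sigma(\beta,M)\cap B_\rho} \frac{n\sum_j d_j f_j^2}{\left(2\sum_j d_j^2\right)^{1/2}}.$$

Note $d_j = d_j(M_0(f))$ depending on $f$. By the second inequality of (28), we have $N \gg N(M_0(f))$ and $d_j = 0$, for $j > N$,

$$L_n = \frac{n}{\sqrt{2}}\inf_{f \in \Sigma(\beta,M)\cap B_\rho} \frac{\sum_{j=1}^{N} d_j f_j^2}{\left(\sum_{j=1}^{N} d_j^2\right)^{1/2}}.$$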



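First, since $N_0 := N(M_0(f)) \ge (\gamma_n/c_n)^{1/(2\beta)}\cdot n^{2/(4\beta+1)} \to \infty$ uniformly for $f \in \Sigma(\beta,M)\cap B_\rho$,

$$\sum_{j=1}^{N} d_j^2 = \lambda^2\sum_{j=1}^{N}\left(1 - (j/N_0)^{2\beta}\right)_+^2 = (1+o(1))\,\lambda^2 N_0\int_0^1 (1-t^{2\beta})^2\,dt = (1+o(1))\,\lambda^2 N_0\,\frac{8\beta^2}{(2\beta+1)(4\beta+1)} \qquad (34)$$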
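uniformly for $f \in \Sigma(\beta, M)\cap B_\rho$. Second, consider, with $N_0 = N(M_0(f))$,

$$\sum_{j=1}^{N} d_j f_j^2 = \lambda\sum_{j=1}^{N_0} f_j^2\left(1 - (j/N_0)^{2\beta}\right) = \lambda\left[\sum_{j=1}^{N} f_j^2 - \sum_{j=1}^{N_0} f_j^2 (j/N_0)^{2\beta} - \sum_{j=N_0+1}^{N} f_j^2\right]. \qquad (35)$$

Note

$$\sum_{j=1}^{N} f_j^2 = \sum_{j=1}^{\infty} f_j^2 - \sum_{j=N+1}^{\infty} f_j^2 \ge \rho - N^{-2\beta}M = \rho\left(1 - \frac{M}{N^{2\beta}\rho}\right) = \rho(1+o(1)), \qquad (36)$$

where the last step is due to the second inequality of (28). On the other hand, since $N \gg N_0$
and $N_0 = [\rho^{-1}(4\beta+1)M_0(f)]^{1/(2\beta)}$,

$$\sum_{j=1}^{N_0} f_j^2 (j/N_0)^{2\beta} + \sum_{j=N_0+1}^{N} f_j^2 \le \sum_{j=1}^{N} f_j^2 (j/N_0)^{2\beta} \le N_0^{-2\beta}M_0(f) = \rho\,(1+4\beta)^{-1}. \qquad (37)$$

Combining (35) - (37) gives

$$\sum_{j=1}^{N} f_j^2 d_j \ge (1+o(1))\,\lambda\rho\,\frac{4\beta}{4\beta+1}$$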



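uniformly for $f \in \Sigma(\beta, M)\cap B_\rho$. Combining this with (34) gives

$$\frac{n\sum_j d_j f_j^2}{\left(2\sum_j d_j^2\right)^{1/2}} \ge (1+o(1))\,\frac{n}{\sqrt{2}}\sqrt{\frac{2(2\beta+1)}{4\beta+1}}\,\frac{\rho}{\sqrt{N(M_0(f))}}$$

$$\ge (1+o(1))\sqrt{\frac{(2\beta+1)\,c_n^{2+1/(2\beta)}}{(4\beta+1)^{1+1/(2\beta)}(M+\gamma_n)^{1/(2\beta)}}}$$

$$\ge (1+o(1))\sqrt{\frac{(2\beta+1)\,c_n^{2+1/(2\beta)}}{(4\beta+1)^{1+1/(2\beta)}M^{1/(2\beta)}}}$$

uniformly for $f \in \Sigma(\beta, M)\cap B_\rho$. Therefore,

$$L_n \ge (1+o(1))\sqrt{\tfrac{1}{2}A_0(\beta)\,c_n^{2+1/(2\beta)}M^{-1/(2\beta)}},$$

and the result follows.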
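
The constants in the chain above can be checked numerically. The following is a minimal sketch, not part of the original argument: the function names are hypothetical, and it assumes $\rho = c_n n^{-4\beta/(4\beta+1)}$ and $N(M_0) = [(4\beta+1)M_0/\rho]^{1/(2\beta)}$ as used in the proof. It compares the integral $\int_0^1(1-t^{2\beta})^2dt$ with the closed form $8\beta^2/((2\beta+1)(4\beta+1))$, and the expression $n\rho\sqrt{(2\beta+1)/(4\beta+1)}/\sqrt{N(M_0)}$ with its simplified form.

```python
import numpy as np

def k_integral(beta, m=200_000):
    # Midpoint rule for the integral of (1 - t^(2*beta))^2 over [0, 1].
    t = (np.arange(m) + 0.5) / m
    return float(np.mean((1.0 - t ** (2 * beta)) ** 2))

def k_closed_form(beta):
    # Closed form 1 - 2/(2b+1) + 1/(4b+1) = 8 b^2 / ((2b+1)(4b+1)).
    return 8 * beta**2 / ((2 * beta + 1) * (4 * beta + 1))

def ratio_bound(n, beta, c, M0):
    # n * rho * sqrt((2b+1)/(4b+1)) / sqrt(N0), under the assumed
    # parametrization rho = c * n^(-4b/(4b+1)), N0 = ((4b+1) M0 / rho)^(1/(2b)).
    rho = c * n ** (-4 * beta / (4 * beta + 1))
    N0 = ((4 * beta + 1) * M0 / rho) ** (1 / (2 * beta))
    return n * rho * np.sqrt((2 * beta + 1) / (4 * beta + 1)) / np.sqrt(N0)

def simplified_bound(beta, c, M0):
    # sqrt( (2b+1) c^(2+1/(2b)) / ((4b+1)^(1+1/(2b)) M0^(1/(2b))) )
    e = 1 / (2 * beta)
    return np.sqrt((2 * beta + 1) * c ** (2 + e)
                   / ((4 * beta + 1) ** (1 + e) * M0 ** e))
```

Both identities hold for any $n$, which reflects that $n$ enters the bound only through $c_n$.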
Proof of Lemma 6. Since



N 



Var(M) = ( 4 + — J F 



< 1 + o l + . 



n 



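by the first inequality of (28),

$$\frac{\mathrm{Var}(\hat M)}{\gamma_n^2} = o(1)$$

uniformly for $f \in \Sigma(\beta,M)\cap B_\rho$. Combining with $E\hat M = M_0(f)$ and using Chebyshev's inequality
give

$$\frac{\hat M - M_0(f)}{\gamma_n} = o_P(1),$$

and then

$$\left|\frac{\hat M}{M_0(f)} - 1\right| \le \left|\frac{\hat M - M_0(f)}{\gamma_n}\right| = o_P(1)$$

uniformly for $f \in \Sigma(\beta,M)\cap B_\rho$.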



5 Appendix 

5.1 Adaptive minimax estimation with known $\beta$

For the convenience of the reader, we sketch the modified plug-in method of Golubev [17],
which attains the Pinsker bound for known smoothness $\beta$ and unknown bound $M$, in the
framework of Sobolev ellipsoids. For more comprehensive results, allowing also for unknown
$\beta$, cf. [18], [36]. Consider the estimation problem for $f = (f_j)_{j=1}^{\infty}$, with squared $\ell_2$-loss, in the
Gaussian sequence model

$$Y_j = f_j + n^{-1/2}\xi_j, \quad j = 1, 2, \ldots,$$

with $f \in \Sigma(\beta, M)$. With known $\beta$ and unknown $M$, the aim is to find an estimator which
is asymptotically minimax in the sense of Pinsker [33]. For known $M$, the optimal filter
coefficients are $(1 - \mu j^{\beta})_+$, where $\mu$ is determined by



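$$\frac{1}{n}\sum_{j} j^{\beta}(1 - \mu j^{\beta})_+ = \mu M.$$

Since

$$\mu \sim \left(\frac{\beta}{nM(\beta+1)(2\beta+1)}\right)^{\beta/(2\beta+1)},$$

the optimal truncation index (or bandwidth) is of the order $n^{1/(2\beta+1)}$.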
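Choose $n^{1/(2\beta+1/2)} \gg N \gg n^{1/(2\beta+1)}$ and $1 \gg \gamma_n \gg N^{2\beta+1/2}/n$, and define

$$M_{0,f} = \sum_{j=1}^{N} j^{2\beta} f_j^2 + \gamma_n.$$

Define $\tilde N = N(M_{0,f}) = a\cdot n^{1/(2\beta+1)}M_{0,f}^{1/(2\beta+1)}$, where $a$ is a constant to be chosen. Define
"oracle" filter coefficients, depending on $f$, as

$$d_j = d(j/\tilde N), \quad \text{where } d(t) = (1 - t^{\beta})_+ .$$

Consider the oracle estimator $(d_jY_j)_{j=1}^{\infty}$. Its risk is

$$\sum_{j=1}^{N}(1-d_j)^2 f_j^2 + \sum_{j>N} f_j^2 + \frac{1}{n}\sum_{j=1}^{N} d_j^2 := A_1 + A_2 + A_3.$$

To bound the terms $A_i$, note first

$$A_1 \le \sup_j (1-d_j)^2 j^{-2\beta}M_{0,f} \le \tilde N^{-2\beta}M_{0,f} \le a^{-2\beta}n^{-2\beta/(2\beta+1)}(M+\gamma_n)^{1/(2\beta+1)}.$$

Second, $A_2 = \sum_{j>N} f_j^2 \le N^{-2\beta}M = o(n^{-2\beta/(2\beta+1)})$. Furthermore,

$$A_3 = \frac{\tilde N}{n}\cdot\frac{1}{\tilde N}\sum_{j}\left(1 - (j/\tilde N)^{\beta}\right)_+^2 = \frac{\tilde N}{n}\int_0^{\infty}(1-t^{\beta})_+^2\,dt\,(1+o(1))$$

$$\le a\,n^{-2\beta/(2\beta+1)}M^{1/(2\beta+1)}\cdot\frac{2\beta^2}{(\beta+1)(2\beta+1)}\,(1+o(1))$$

uniformly over $f \in \Sigma(\beta, M)$. Combine these and choose

$$a = \left(\frac{(\beta+1)(2\beta+1)}{\beta}\right)^{1/(2\beta+1)},$$

and we find that the supremal risk, over $f \in \Sigma(\beta, M)$, of the oracle estimator is at most

$$c(\beta)\cdot n^{-2\beta/(2\beta+1)}M^{1/(2\beta+1)}(1+o(1)), \qquad (38)$$

where

$$c(\beta) = \left(\frac{\beta}{\beta+1}\right)^{2\beta/(2\beta+1)}\cdot(1+2\beta)^{1/(2\beta+1)}$$

is the Pinsker constant.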
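
The choice of $a$ balances the bias bound $a^{-2\beta}$ against the variance bound $a\cdot 2\beta^2/((\beta+1)(2\beta+1))$. The following minimal sketch (function names are hypothetical, introduced only for illustration) evaluates the sum of the two bounds, the minimizing $a$ obtained by setting the derivative to zero, and the Pinsker constant $c(\beta)$, so that the identity "minimum of the bound equals $c(\beta)$" can be confirmed numerically.

```python
import numpy as np

def risk_bound_constant(a, beta):
    # Leading constant of A_1 + A_3: bias part a^(-2*beta) plus
    # variance part a * 2*beta^2 / ((beta+1)(2*beta+1)).
    return a ** (-2 * beta) + a * 2 * beta**2 / ((beta + 1) * (2 * beta + 1))

def optimal_a(beta):
    # Root of the derivative: a^(2*beta+1) = (beta+1)(2*beta+1)/beta.
    return ((beta + 1) * (2 * beta + 1) / beta) ** (1 / (2 * beta + 1))

def pinsker_constant(beta):
    # c(beta) = (beta/(beta+1))^(2b/(2b+1)) * (1+2b)^(1/(2b+1)).
    e = 1 / (2 * beta + 1)
    return (beta / (beta + 1)) ** (2 * beta * e) * (1 + 2 * beta) ** e
```

For instance, `risk_bound_constant(optimal_a(beta), beta)` and `pinsker_constant(beta)` can be compared for several values of `beta`.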



The next step is to show that the risk (38) is also attained when the unknown $M_{0,f}$ is replaced
by an unbiased estimator. The latter is $\hat M = \sum_{j=1}^{N} j^{2\beta}\hat f_j^2 + \gamma_n$, where $\hat f_j^2 = Y_j^2 - n^{-1}$. Then

$$E(\hat M) = \sum_{j=1}^{N} j^{2\beta} f_j^2 + \gamma_n = M_{0,f} \le M + \gamma_n$$
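and

$$\mathrm{Var}(\hat M) = \sum_{j=1}^{N} j^{4\beta}\,\mathrm{Var}(Y_j^2) = \sum_{j=1}^{N} j^{4\beta}n^{-2}(2 + 4nf_j^2) = 2n^{-2}\sum_{j=1}^{N} j^{4\beta} + 4n^{-1}\sum_{j=1}^{N} j^{4\beta}f_j^2 = J_1 + J_2,$$

where the first term

$$J_1 = 2n^{-2}N^{4\beta+1}\cdot\frac{1}{N}\sum_{j=1}^{N}(j/N)^{4\beta} \sim 2n^{-2}N^{4\beta+1}\cdot\int_0^1 x^{4\beta}\,dx = o(1)$$

since $N = o(n^{1/(2\beta+1/2)})$, and the second term

$$J_2 = 4n^{-1}\sum_{j=1}^{N} j^{4\beta}f_j^2 \le 4n^{-1}N^{2\beta}M = o(n^{-2}N^{4\beta+1})$$

uniformly for $f \in \Sigma(\beta, M)$ since $N \gg n^{1/(2\beta+1)}$. Combining these gives $\mathrm{Var}(\hat M) = o(1)$
uniformly for $f \in \Sigma(\beta, M)$. Recalling $\gamma_n \gg N^{2\beta+1/2}/n$ gives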



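$$\mathrm{Var}\left(\frac{\hat M - M_{0,f}}{\gamma_n}\right) \le \frac{2Kn^{-2}N^{4\beta+1}}{\gamma_n^2} = o(1)$$

for a constant $K$, and then

$$\left|\frac{\hat M}{M_{0,f}} - 1\right| \le \left|\frac{\hat M - M_{0,f}}{\gamma_n}\right| = o_P(1)$$

uniformly.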



Finally, it can be shown that the difference between the oracle estimator $(d_jY_j)_{j=1}^{\infty}$ and the
estimator $(d(j/N(\hat M))Y_j)_{j=1}^{\infty}$ is negligible, i.e.

$$E\sum_{j}\left[d(j/N(M_{0,f})) - d(j/N(\hat M))\right]^2 Y_j^2 = o(n^{-2\beta/(2\beta+1)}).$$
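
The mean and variance of the plug-in estimator $\hat M$ can be illustrated numerically. The sketch below is not from the original text: the function names, the example signal and the simulation sizes are assumptions chosen for illustration. It computes the exact mean $M_{0,f}$ and the two variance terms $J_1$, $J_2$ from the formula $\mathrm{Var}(Y_j^2) = n^{-2}(2 + 4nf_j^2)$, and draws simulated copies of $\hat M$ from the sequence model $Y_j = f_j + n^{-1/2}\xi_j$.

```python
import numpy as np

def m_hat_moments(f, n, beta, gamma_n):
    # Exact mean of M-hat and the two variance terms J1, J2,
    # for a signal f = (f_1, ..., f_N) truncated at N = len(f).
    f = np.asarray(f, dtype=float)
    j = np.arange(1, f.size + 1, dtype=float)
    mean = float(np.sum(j ** (2 * beta) * f**2) + gamma_n)
    J1 = float(2.0 / n**2 * np.sum(j ** (4 * beta)))
    J2 = float(4.0 / n * np.sum(j ** (4 * beta) * f**2))
    return mean, J1, J2

def simulate_m_hat(f, n, beta, gamma_n, reps=2000, seed=0):
    # Monte Carlo draws of M-hat = sum_j j^(2 beta) (Y_j^2 - 1/n) + gamma_n.
    rng = np.random.default_rng(seed)
    f = np.asarray(f, dtype=float)
    j = np.arange(1, f.size + 1, dtype=float)
    Y = f + rng.standard_normal((reps, f.size)) / np.sqrt(n)
    return (j ** (2 * beta) * (Y**2 - 1.0 / n)).sum(axis=1) + gamma_n
```

A possible usage is to compare `simulate_m_hat(...).mean()` with the first value returned by `m_hat_moments`, and the sample variance with `J1 + J2`.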

3=1 



5.2 Proofs for Section [2] 

Proof of Lemma 1. (a) Under the null hypothesis we have $Y_j^2 = n^{-1}\xi_j^2$, hence $T = \sum_j d_j(\xi_j^2 - 1)/\sqrt{2}$. Then it follows from (16) and $n\rho \to \infty$ that the CLT infinitesimality
condition

$$\sup_j d_j^2 = o(1)$$

holds uniformly over $d \in \mathcal{D}$, proving the assertion.
(b) Since $Y_j^2 = f_j^2 + 2n^{-1/2}f_j\xi_j + n^{-1}\xi_j^2$ we have

$$T = \frac{1}{\sqrt{2}}\sum_j d_j\left(nf_j^2 + 2n^{1/2}f_j\xi_j + (\xi_j^2 - 1)\right), \qquad (39)$$
$$T - L(d, f) = \frac{1}{\sqrt{2}}\sum_j d_j\left(2n^{1/2}f_j\xi_j + (\xi_j^2 - 1)\right). \qquad (40)$$

An easy calculation gives 

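$$\mathrm{Var}_f\,T = \frac{1}{2}\sum_j d_j^2(4nf_j^2 + 2) = 1 + 2n\sum_j d_j^2 f_j^2$$

where in view of (16) we have for $f \in B'_\rho$

$$n\sum_j d_j^2 f_j^2 \le n\,\sup_j d_j^2\,\|f\|^2 \le 2\delta = o(1).$$

Consequently, $\mathrm{Var}_f\,T \to 1$ uniformly. Now the CLT infinitesimality condition on the sum
(40) amounts to

$$\sup_j d_j^2(nf_j^2 + 1) = o(1). \qquad (41)$$

For $f \in B'_\rho$ we have $f_j^2 \le 2\rho$, hence in view of (16)

$$d_j^2(nf_j^2 + 1) \le \frac{\delta}{n\rho}(2n\rho + 1) \le 3\delta$$

for $n$ sufficiently large. Hence (41) is fulfilled uniformly over $d \in \mathcal{D}$ and $f \in B'_\rho$, and the
claim follows.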



(c) Set fj ~ N(0,a]); then in view of ([39]) 

T - a) = E 4 (2« V2 /iO + ft? " 1)) + "i E * (/I " °?) • ( 42 ) 



An easy calculation gives 

Var /2 1 = - E 4 ( 4n(T i + 2 ) + n E d 

= i+«E c ?( 2o i +ff i) 



where in view of (|16p we have for a £ B' p 



E^<*p~ 1 E o ?^ 5W = °( 1 )' 
ErfM< 2 p«E d M< 4 M = (i)- 



Consequently, Var — >• 1 uniformly. Now the infinitesimality condition on the sum (I42D 
amounts to 

sup (i| (l + ncrj + naf) = o(l). (43) 
For $\sigma \in B'_\rho$ we have $\sigma_j^2 \le 2\rho$, hence in view of (16)

d 2 (l + no) + no- 4 ) < d 2 (l + np + np 2 ) < 3<5 

for $n$ sufficiently large. Hence (43) is fulfilled uniformly over $d \in \mathcal{D}$ and $\sigma \in B'_\rho$, and the
claim follows. ■



Proof of Lemma 2. Let $\bar{\mathcal{D}}$ be defined as $\mathcal{D}$ in (16) but with condition $\|d\|^2 = 1$ replaced
by $\|d\|^2 \le 1$. Then, since $L(d, f^2)$ is linear in $d$, for every $d \in \bar{\mathcal{D}}$ there is a $\tilde d \in \mathcal{D}$ such that
$L(d, f^2) \le L(\tilde d, f^2)$ for every $f$. Hence it suffices to prove the claim for $\mathcal{D}$ replaced by the
compact convex set $\bar{\mathcal{D}}$. The restriction $f \in \Sigma(\beta, M)\cap B'_\rho$ is equivalent to $f^2$ being in the set

$$\left\{g \in \mathbb{R}_+^n : \sum_j g_j j^{2\beta} \le M,\ \rho \le \sum_j g_j \le 2\rho\right\} \qquad (44)$$

which is convex and compact (and nonempty for large enough $n$ since $\rho \to 0$). The functional
$L$ is bilinear in $d$ and $f^2$; the standard minimax theorem now furnishes the result. ■



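Lemma 7 For $n$ large enough, the saddlepoint $d_0, f_0$ of Lemma 2 is given by

$$f_{0,j}^2 = (\lambda - \mu j^{2\beta})_+, \qquad d_0 = \frac{f_0^2}{\|f_0^2\|}, \qquad j = 1, \ldots, n,$$

where $\lambda$, $\mu$ are the unique positive solutions of the equations

$$\sum_{j=1}^{n} j^{2\beta}(\lambda - \mu j^{2\beta})_+ = M, \qquad \sum_{j=1}^{n}(\lambda - \mu j^{2\beta})_+ = \rho. \qquad (45)$$

The value of $L$ at the saddlepoint is

$$L_0 = L(d_0, f_0^2) = \frac{n}{\sqrt{2}}\,\|f_0^2\|. \qquad (46)$$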
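
A minimal numerical sketch of this saddlepoint follows; it is an illustration under stated assumptions rather than the authors' procedure. The function names are hypothetical, the restriction $\sup_j d_j^2 \le \delta/(n\rho)$ is ignored, and the two equations in (45) are solved by bisection on the cutoff $t = (\lambda/\mu)^{1/(2\beta)}$, using that the ratio of the two sums in (45) does not depend on $\lambda$ and is increasing in $t$. The second function evaluates the constant appearing in the proof of Lemma 5, with $c = \rho\,n^{4\beta/(4\beta+1)}$.

```python
import numpy as np

def saddle_point(n, beta, M, rho, iters=100):
    j = np.arange(1, n + 1, dtype=float)
    w = j ** (2 * beta)

    def sums(t):
        # k_j = (1 - (j/t)^(2 beta))_+ corresponds to mu/lambda = t^(-2 beta).
        k = np.clip(1.0 - w / t ** (2 * beta), 0.0, None)
        return k.sum(), (w * k).sum()

    # Bisection: find t with sum(w*k)/sum(k) = M/rho.
    lo, hi = 1.0 + 1e-9, float(n)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        s0, s1 = sums(mid)
        if s1 / s0 < M / rho:
            lo = mid
        else:
            hi = mid
    t = 0.5 * (lo + hi)
    s0, _ = sums(t)
    lam = rho / s0                      # enforces sum_j f_{0,j}^2 = rho
    mu = lam / t ** (2 * beta)
    f0_sq = np.clip(lam - mu * w, 0.0, None)
    d0 = f0_sq / np.linalg.norm(f0_sq)  # d_0 = f_0^2 / ||f_0^2||
    L0 = n * np.linalg.norm(f0_sq) / np.sqrt(2.0)
    return lam, mu, f0_sq, d0, L0

def lemma5_constant(beta, c, M):
    # sqrt( (2b+1) c^(2+1/(2b)) / ((4b+1)^(1+1/(2b)) M^(1/(2b))) )
    e = 1 / (2 * beta)
    return np.sqrt((2 * beta + 1) * c ** (2 + e)
                   / ((4 * beta + 1) ** (1 + e) * M ** e))
```

For moderately large $n$, the value `L0` returned by `saddle_point` can be compared with `lemma5_constant`; the bisection assumes that $M/\rho$ lies in the range of the ratio, which holds when $n$ exceeds the cutoff $t$.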



Proof. Ignore initially the restriction $\sup_j d_j^2 \le \delta/(n\rho)$ and consider maximizing $L(d, f^2)$ in $d$
for given $f$. Under the sole restriction $\|d\|^2 = 1$, by Cauchy-Schwarz the solution is found as

$$d(f) = \frac{f^2}{\|f^2\|}.$$

It remains to minimize $L(d(f), f^2) = n\|f^2\|/\sqrt{2}$ under the restrictions on $f^2$. Setting $g_j = f_j^2$,
one has to minimize $\|g\|$ on the convex set (44). This is solved using Lagrange multipliers
$\lambda, \mu$.

To show that the solution $d_0$ fulfills the restriction $\sup_j d_j^2 \le \delta/(n\rho)$, we note that

$$f_{0,j}^2 = (\lambda - \mu j^{2\beta})_+ = \lambda\left(1 - \mu\lambda^{-1}j^{2\beta}\right)_+ \le \lambda; \qquad (47)$$

below (cf. (54), Lemma 8) it is shown that $\lambda \asymp n^{-1-1/(4\beta+1)}$ and $n\|f_0^2\| \asymp L_{n,0} \asymp 1$. This
implies

$$n\rho\,d_{0,j}^2 = n\rho\cdot O(n^2\lambda^2),$$

thus for $\delta = (\log n)^{-1}$ we have that $d_0 \in \mathcal{D}$ for $n$ large enough.

Proof of Lemma 3. The log-likelihood ratio is



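$$\log\prod_{j=1}^{n}\left(\frac{n^{-1}}{\sigma_j^2 + n^{-1}}\right)^{1/2}\exp\left\{-\frac{1}{2}\left(\frac{1}{\sigma_j^2 + n^{-1}} - n\right)Y_j^2\right\} = \frac{1}{2}\sum_{j=1}^{n}\log\frac{n^{-1}}{\sigma_j^2 + n^{-1}} + \frac{1}{2}\sum_{j=1}^{n} nY_j^2\left(1 - \frac{n^{-1}}{\sigma_j^2 + n^{-1}}\right).$$

This shows (a) by setting $d = \tilde d/\|\tilde d\|$ for $\tilde d_j = 1 - \dfrac{n^{-1}}{\sigma_j^2 + n^{-1}}$. Now for $\sigma_j^2 = f_{0j}^2$ we have, as

$$\lambda \asymp n^{-1-1/(4\beta+1)}, \qquad (48)$$

$$nf_{0j}^2 = n\lambda\left(1 - \lambda^{-1}\mu j^{2\beta}\right)_+ \le n\lambda \asymp n\cdot n^{-1-1/(4\beta+1)} = n^{-1/(4\beta+1)} = o(1),$$

hence $\tilde d_j \sim nf_{0j}^2$ uniformly over $j = 1, \ldots, n$. This implies $\|\tilde d\| \sim n\|f_0^2\|$ and

$$d_j = \frac{\tilde d_j}{\|\tilde d\|} \sim \frac{f_{0j}^2}{\|f_0^2\|}$$

uniformly in $j \le n$. The proof of $n\rho\,d_j^2 \le \delta$ now exactly follows (47), (48). The convergence
of the critical value to $z_\alpha$ now is a consequence of Lemma 1 (a). ■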



Lemma 8 Suppose $\rho = c\cdot n^{-4\beta/(4\beta+1)}$, $c$ constant. Then the saddlepoint value $L_0$
fulfills

$$L_0 = L(d_0, f_0^2) \sim \sqrt{A_0(\beta)\,M^{-1/(2\beta)}c^{2+1/(2\beta)}/2}.$$



Proof. The proof of Lemma 7 shows that $L(d_0, f_0^2)$ is also the saddlepoint value under the
weaker restrictions $\|d\|^2 \le 1$, $f \in \Sigma(\beta, M)\cap B_\rho$. Let us sketch a derivation of the asymptotics
by a renormalization technique. Suppose that $d_j = h^{1/2}d(hj)$, $j \le n$, where $h$ is a bandwidth
parameter tending to 0, and the continuous function $d : [0, \infty) \to [0, \infty)$ satisfies

$$\int_0^{\infty} d^2(x)\,dx \le 1. \qquad (49)$$

Consider another continuous function $\sigma : [0, \infty) \to [0, \infty)$ satisfying

$$\int_0^{\infty} x^{2\beta}\sigma^2(x)\,dx \le 1 \quad\text{and}\quad \int_0^{\infty}\sigma^2(x)\,dx \ge 1 \qquad (50)$$

and set $\sigma_j^2 = Mh^{2\beta+1}\sigma^2(hj)$, $j \le n$. Choose $h = (\rho/M)^{1/(2\beta)}$. The coefficient vector
$d = (d_j)_{j=1}^{n}$ satisfies

$$\|d\|^2 = h\sum_{j=1}^{n} d^2(hj) \to \int_0^{\infty} d^2(x)\,dx \le 1.$$

Identifying $f^2 \in \mathbb{R}_+^n$ with $(\sigma_j^2)_{j=1}^{n}$, the restriction $f \in \Sigma(\beta, M)$ is asymptotically satisfied
since

$$\sum_{j=1}^{\infty} j^{2\beta}\sigma_j^2 = Mh\sum_{j=1}^{\infty}(jh)^{2\beta}\sigma^2(jh) \to M\int_0^{\infty} x^{2\beta}\sigma^2(x)\,dx \le M, \quad h \to 0.$$

The restriction $f \in B_\rho$ is also asymptotically satisfied since

$$\sum_{j=1}^{\infty}\sigma_j^2 = Mh^{2\beta+1}\sum_{j=1}^{\infty}\sigma^2(jh) = \rho h\sum_{j=1}^{\infty}\sigma^2(jh) \sim \rho\int_0^{\infty}\sigma^2(x)\,dx \ge \rho.$$

Therefore,

$$\frac{n}{\sqrt{2}}\sum_{j=1}^{n} d_j\sigma_j^2 = \frac{n}{\sqrt{2}}\,Mh^{2\beta+1/2}\,h\sum_{j=1}^{\infty} d(jh)\sigma^2(jh) \sim \frac{c^{1+1/(4\beta)}M^{-1/(4\beta)}}{\sqrt{2}}\int_0^{\infty} d(x)\sigma^2(x)\,dx.$$
The saddle point problem (20) for each $n$ is thus asymptotically expressed in terms of a fixed
continuous problem with constraints (49) and (50). There is a unique positive solution $(\lambda^*, \mu^*)$
for the equations (cp. [16])

$$\int_0^{\infty} x^{2\beta}(\lambda - \mu x^{2\beta})_+\,dx = 1, \qquad (51)$$
$$\int_0^{\infty}(\lambda - \mu x^{2\beta})_+\,dx = 1. \qquad (52)$$

Let $\|\cdot\|_2$ and $\langle\cdot,\cdot\rangle_2$ denote norm and scalar product in $L_2(\mathbb{R}_+)$. Then the saddle point $(d^*, \sigma^{*2})$
is given by

$$d^* = \frac{\sigma^{*2}}{\|\sigma^{*2}\|_2}, \qquad \sigma^{*2}(x) = (\lambda^* - \mu^*x^{2\beta})_+ . \qquad (53)$$

Then the value of the game is

$$\sup_{d \text{ in } (49)}\ \inf_{\sigma \text{ in } (50)} \langle d, \sigma^2\rangle_2 = \inf_{\sigma \text{ in } (50)}\ \sup_{d \text{ in } (49)} \langle d, \sigma^2\rangle_2 = \langle d^*, \sigma^{*2}\rangle_2 = \|\sigma^{*2}\|_2 = \sqrt{A_0(\beta)},$$

where the sup is taken for $d$ satisfying (49), the inf is taken for $\sigma$ satisfying (50), and $A_0(\beta)$
is Ermakov's constant in (6). The continuous saddlepoint problem arises naturally in a
continuous Gaussian white noise setting and a parameter space described by the continuous
Fourier transformation, e.g. a Sobolev class of functions on the whole real line (cf. [16], [17]).
The above argument provides the guideline for a more rigorous proof, based on calculating
the sharp asymptotics of $\lambda$ and $\mu$ directly from (45). The rough order of $\lambda$ can be found as
follows. By equating $f_{0,j}^2 = \sigma_j^{*2}$, we find

$$(\lambda - \mu j^{2\beta})_+ = Mh^{2\beta+1}\sigma^{*2}(hj).$$

Writing the left-hand side as

$$\lambda\left(1 - \left((\mu/\lambda)^{1/(2\beta)}j\right)^{2\beta}\right)_+ ,$$

we find $\lambda \asymp h^{2\beta+1}$, $h \asymp (\mu/\lambda)^{1/(2\beta)}$ and thus

$$\lambda \asymp h^{2\beta+1} \asymp \rho^{(2\beta+1)/(2\beta)} \asymp n^{-1-1/(4\beta+1)}. \qquad (54)$$



Remark 8 The paper of Ermakov [9], when calculating the asymptotics of $\lambda$, $\mu$ in (45) and
of $A = 2L_0^2$ (in a more general framework where $\sum_j a_jf_j^2 \le P_0$, $\sum_j b_jf_j^2 \ge \rho$), contains an
error for $A$. Here is the correction using the notations therein. Let $a_j = Lj^{2\gamma}$, $b_j = Mj^{2\nu}$,
where $\gamma > \nu > 0$, $L$ and $M$ are positive constants, and set $\epsilon = n^{-1/2}$. Then as $\epsilon \to 0$ we have
that

(27 + 2^ + 1)/ L \ 2(7-,) / 1 \ 2( 7 -) 2C7±i±fl 

(4^ + l)pA _ 4 4 7 - 4^ 